ABSTRACT 

In the spring of 1991 the first full-scale National 
Household Education Survey (NHES:91) was conducted for the National 
Center for Education Statistics (NCES). The NHES:91 was a national 
random digit dial telephone survey of about 14,000 parents of 3- to 
8-year-old children concerning the educational experiences of young 
children. A reinterview program was included to examine the impact of 
measurement errors on estimates of the characteristics of early 
educational experience. The interval between completion of the original 
interview and the reinterview ranged from fewer than 10 days to 41 days. 
The methodology and results of this 
computer-assisted reinterview program are discussed, comparing 
responses from the original and reconciled interviews. A sample of 
604 cases was selected for reinterviews. Reinterviews were completed 
with 534 households (a response rate of 88 percent). Reinterview 
results are encouraging in that most items included had small to 
moderate measurement errors. For the specific items with high 
measurement errors, potential associated problems included vague or 
ambiguous classifications and lack of parental knowledge about the 
item. One weakness was the limited scope and size of the reinterview 
program. Four tables present analysis data. (Contains 3 references.) 
(SLD) 
REINTERVIEW PROGRAM FOR THE 
1991 NATIONAL HOUSEHOLD EDUCATION SURVEY 

J. Michael Brick, Westat, Inc. 
Jerry West, National Center for Education Statistics 



1. Introduction 

In the spring of 1991, the first full-scale National Household Education Survey 
(NHES:91) was conducted for the National Center for Education Statistics (NCES) by Westat, 
Inc. The NHES:91 was a national, random digit dial (RDD) telephone survey of about 60,000 
households designed to estimate characteristics of the educational experiences of young children 
and adults. The survey was conducted using computer-assisted telephone interviewing (CATI). 

A reinterview program was included in the NHES:91 in order to examine the impact 
of measurement errors on the estimates of the characteristics of early educational experience. A 
sample of parents who completed the original telephone interview concerning their 3- to 8-year-old 
child was recontacted and asked to respond to a subset of the questions asked in the original 
interview. The responses to the original interview and the reinterview are the source of the 
statistics on measurement errors presented in this paper. 

The primary objectives of the NHES:91 reinterview program were to identify 
survey items that were not reliable, to quantify the magnitude of the response variance for groups 
of items, and to provide feedback for improving the design of future NHES surveys. Since the 
interviewing was a closely monitored CATI survey conducted at Westat's centralized telephone 
centers, there was no need to use the reinterviews to prevent the falsification of interviews. 

The reinterview program had a goal of completing 500 reinterviews of the nearly 
14,000 interviews of parents of 3- to 8-year-olds. Only a subset of the full set of items included in 
the original interview were included in the reinterview to reduce the burden on the respondents and 
to control the cost of the reinterview. The items selected for the reinterview were ones that were 
important substantively and were not highly dependent on the circumstances surrounding the time 
of the interview. 

Sometimes, respondents give answers during reinterviews that differ from the 
original interview responses. These differences, or discrepancies, could arise as a result of several 
different causes, and not all discrepancies are errors. In the NHES:91 reinterview program, the 
interviewers attempted to categorize the discrepancies into four categories: 

Circumstances related to the child changed between the time of the first and 
the second interview; both answers, although different, may be correct, 

The original response was recorded (interviewer error) or reported 
(respondent error) incorrectly, 

The reinterview response was recorded or reported incorrectly, 

Both the original and reinterview responses were recorded or reported 
incorrectly. 

Because the reinterview was also computer-assisted, the responses to the original 
interview and reinterview were automatically compared and displayed for the interviewer at the end 
of the reinterview, not after each item was asked. If the reinterview response was incorrect, the 
reconciled value was entered by the interviewer at this time. This paper compares the responses to 
the original interview and the reconciled reinterview, discusses the reliability of the respondent's 
answers, and discusses the reasons for errors. 



2. Design of the NHES:91 and Reinterview Program 

The NHES:91 was a RDD telephone survey conducted with persons in a sample of 
telephone households in the 50 States and the District of Columbia between February and April of 
1991. The reinterview program of NHES:91 was included for the Early Childhood Education 
(ECE) component of the survey, in which the parents of children from 3 to 8 years old were 
interviewed. 

The survey covered the noninstitutional civilian population of 3- to 8-year-olds in 
the United States. Since only persons in telephone households were surveyed, the estimates were 
adjusted so that the totals were consistent with the total number of persons in all households. 
Household screening interviews were completed with 60,314 households, including 13,257 
households with at least one 3- to 8-year-old in the household. A total of 13,892 ECE interviews 
were completed for the survey. The completion rate for the ECE interview, or the percent of 
interviews conducted, was 94 percent. The overall response rate for the ECE interview, the 
product of the screening response rate and the ECE completion rate, was 76 percent. Further details 
on the sample design and results of the ECE component of NHES:91 are given in Brick 
et al. (1991). 
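
As a rough check on how the two rates combine, the sketch below backs out the screening response rate implied by the published figures. The variable names are illustrative, and because both inputs are rounded percentages the result is only approximate; it is not a figure reported in this paper.

```python
# Overall ECE response rate = screening response rate x ECE completion rate.
# Both inputs are the rounded percentages quoted in the text.
ece_completion_rate = 0.94
overall_response_rate = 0.76

# Implied screening response rate (approximate, since the inputs are rounded).
implied_screening_rate = overall_response_rate / ece_completion_rate

print(f"Implied screening response rate: {implied_screening_rate:.1%}")
```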

A random sample of completed ECE interviews was selected for reinterview. Not 
all ECE interviews were eligible for reinterview. A case was eligible if it met all of the following 
conditions: 1) the original interview was completed at least 6 weeks after the start of data 
collection; 2) the case was not included in a special longitudinal sample selected for other purposes; 
3) no more than one case was sampled for reinterview per household; and 4) all other extended 
interviews sampled in the household were complete. 

A sample of 604 cases was selected for reinterviews, and 534 of these were 
completed, for a response rate of 88 percent. About half of the nonresponse was due to persons 
who refused to participate in the reinterview. 

The reinterview was originally designed to be conducted 14 days after the 
completion of the original ECE interview. However, toward the end of the data collection period, 
the threshold was reduced in an attempt to complete additional reinterviews. Table 1 below shows 
the number of days between the original interview and the reinterview. 

Table 1. Number of days between completion of original interview and reinterview 

Number of Days      Frequency (1)      Percent 
less than 10              21               4% 
10 to 13                  36               7 
14                       126              24 
15 to 20                 261              49 
21 to 27                  77              15 
28 to 41                   9               2 
Total                    530             100 


(1) The number of days was missing for four of the cases and these are not included in the table. 

The reinterview was conducted using the same CATI system used in the original 
interview. Interviewers read identical items to the parent/guardian who completed the original 
interview. After all of the items for the reinterview were asked, a reconciliation of the original and 
reinterview responses was done automatically by the computer. Up until the end of the interview 
and the appearance of the reconciliation screens, the interviewer was unaware of the responses 
given by the respondent to the original interview. 

As mentioned in the introduction, discrepancies in responses were grouped into 
four categories. A total of 1,618 discrepancies occurred during the 534 reinterviews, or about 3 
per interview. The number of items varied significantly from interview to interview due to skip 
patterns. The reasons for the discrepancies, as reported by the respondent, were distributed across 
the four categories as shown in Table 2. 



Table 2. Number of discrepancies between original and reinterview responses, by reason 

Reason for discrepancies                      Number Reported      Percent 
Child's Situation Changed                            207              13% 
Original Interview Answer was Incorrect            1,034              64 
Reinterview Answer was Incorrect                     320              20 
Both Interview Answers were Incorrect                 41               3 
Didn't Know How to Explain Discrepancy                 4              <1 
Some Other Explanation for Change                     12               1 
Total                                              1,618             100 



Note that for the 207 discrepancies where the child's situation changed, the 
reinterview answer and the original answer were not the same, but this was not an indication of an 
error. However, since the differences between the reconciled reinterview responses and the original 
interview responses were used to indicate an error in the analysis that follows, these cases 
somewhat inflate the estimates of the measurement errors for the NHES:91. The data could be 
reanalyzed without counting these as errors, but our preliminary analyses of these data indicate that 
the differences are minor in nearly all cases. 

One of the interesting methodological features of the NHES:91 reinterview was the 
fact that the results of the original interview were unknown to the interviewers until the completion 
of the reinterview. If we assume that the interviewers conducting the reinterviews were of equal 
quality to the original interviewers (a reasonable assumption since the interviewers worked both 
surveys) and that the chance of making an error was equal in both the original and reinterview, we 
would expect the percent of errors made in the reinterview to approximate the percent of errors 
made in the original interview. This is clearly not the case; in the reconciliation process, about 3 
times as many errors were associated with the original interview as with the reinterview. 
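
The arithmetic behind the figures in Table 2 can be reproduced with a short sketch. The counts are those reported in the table; the dictionary keys and variable names are illustrative choices rather than anything used in the survey system.

```python
# Discrepancy counts by reported reason, taken from Table 2.
discrepancies = {
    "Child's situation changed": 207,
    "Original interview answer was incorrect": 1034,
    "Reinterview answer was incorrect": 320,
    "Both interview answers were incorrect": 41,
    "Didn't know how to explain discrepancy": 4,
    "Some other explanation for change": 12,
}
completed_reinterviews = 534

total = sum(discrepancies.values())
per_interview = total / completed_reinterviews

# Percent distribution of discrepancies across the reported reasons.
percent_by_reason = {
    reason: 100 * count / total for reason, count in discrepancies.items()
}

# Errors attributed to the original interview relative to the reinterview.
original_to_reinterview = (
    discrepancies["Original interview answer was incorrect"]
    / discrepancies["Reinterview answer was incorrect"]
)

print(f"{total} discrepancies, {per_interview:.2f} per reinterview")
print(f"original/reinterview error ratio: {original_to_reinterview:.2f}")
```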

The finding of excess errors in the original interviews is a typical result for 
reinterviews. It has led many designers of reinterview programs to designate a large part of the 
reinterview sample to be conducted without reconciliation, at least partially due to the assumption 
that the interviewers might either perform differently or use the original values to skew the results 
to improve their (reinterview) performance. These results from a situation in which the interviewer 
does not have any opportunity to glance at the original responses suggest that the role of 
reconciliation in reinterviews may need to be reconsidered. 

It is possible that just knowing that a reconciliation process will follow makes 
interviewers more careful and less prone to error. However, the alternative hypothesis that the 
respondent is the source of this inequality in the assignment of the errors is at least as feasible. In 
other words, respondents may wish to be internally consistent with their latest responses, making 
it more comfortable to report that the original interview is in error. If this hypothesis is correct, 
there are important implications for the design and analysis of reinterview data. 



3. Methods Used for the Analysis of the Reinterview 

The statistics computed to examine various aspects of reporting in the original ECE 
survey and its reinterview are the set of statistics developed for assessing response reliability based 
upon reinterview data. The statistics include the gross difference rate, the net difference rate, and 
the index of inconsistency. 

The gross difference rate measures the proportion of cases that had different 
responses in the two administrations of the interview. The net difference rate measures the bias 
after the offsetting misclassifications have been taken into account. The index of inconsistency is a 
less familiar statistic. In some circumstances, the index can be used to measure the proportion of 
the total variability that arises due to random response error. Descriptions of these statistics and 
their interpretation are given by Biemer et al. (1991). 

These statistics are computed based on the number of sample cases reported as 
having the characteristic in the original survey and in the reinterview. No weights are used in the 
analysis. The following table shows the general format of the possible reporting outcomes by the 
original interviews and reinterviews, when there are only two response categories for an item. 



Table 3. General format for interview-reinterview results 

                                              Original ECE Interview 
                                    Number of cases     Number of cases 
                                    with                without 
Reinterview                         characteristic      characteristic      Total 

Number of cases 
with a characteristic                     a                   b             a + b 

Number of cases 
without a characteristic                  c                   d             c + d 

Total                                   a + c               b + d           n = a + b + c + d 



From tables formatted in this fashion it is possible to estimate several characteristics 
relevant to the consistency of the reporting between the original and reinterview. For example, the 
off-diagonal cells estimate the responses that were reported differently in the original interview and 
reinterview. 
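
A minimal sketch of how paired yes/no responses could be tallied into the four cells of Table 3 follows. The function name, the pair ordering, and the example responses are hypothetical and are not NHES:91 data.

```python
from typing import Iterable, Tuple


def tabulate_pairs(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[int, int, int, int]:
    """Count cells a, b, c, d of the Table 3 layout.

    Each pair is (original_has_characteristic, reinterview_has_characteristic).
    Rows of the table are the reinterview; columns are the original interview.
    """
    a = b = c = d = 0
    for original, reinterview in pairs:
        if reinterview and original:
            a += 1  # with the characteristic in both interviews
        elif reinterview:
            b += 1  # reinterview: with; original: without
        elif original:
            c += 1  # reinterview: without; original: with
        else:
            d += 1  # without the characteristic in both interviews
    return a, b, c, d


# Made-up responses for illustration only.
example_pairs = [(True, True), (True, False), (False, True), (False, False), (True, True)]
a, b, c, d = tabulate_pairs(example_pairs)
disagreements = b + c  # off-diagonal cells
```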

The definitions of the statistics computed in this report are given below. Note that 
the reinterview responses are taken as the truth or "standard," and original responses are compared 
with the standard. This is appropriate because the reinterview responses are the results of the 
reconciliation process and should be more accurate than the original responses. However, the 
reconciled responses are still not error-free, so estimates of bias are not technically feasible. 

The gross difference rate is equal to the percent of cases reported as having a 
characteristic in the reinterview but reported as not having the characteristic in the original 
interview, plus the percent of cases reported as not having the characteristic in the reinterview but 
having the characteristic in the original interview. It can be represented as: 

$$ G = \frac{b + c}{n} \times 100 $$

For characteristics that may take more than two values, such as the number of hours of television 
watched by the child, the gross difference is defined as the sum of the off-diagonal elements 
divided by the total sample size. 
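
The definition above, covering both two-category and multi-category items, can be sketched as a single function over a square table of counts. The function name and the example counts are hypothetical; rows are taken to be the reinterview and columns the original interview, as in Table 3.

```python
from typing import Sequence


def gross_difference_rate(table: Sequence[Sequence[int]]) -> float:
    """Percent of cases that fall off the main diagonal of a square table."""
    n = sum(sum(row) for row in table)
    agreements = sum(table[k][k] for k in range(len(table)))
    return 100 * (n - agreements) / n


# Hypothetical counts, not NHES:91 results.
two_by_two = [[40, 6], [4, 50]]                       # [[a, b], [c, d]]
three_way = [[30, 2, 1], [3, 25, 4], [0, 5, 30]]      # item with three categories

g_binary = gross_difference_rate(two_by_two)   # (b + c) / n x 100
g_multi = gross_difference_rate(three_way)     # sum of off-diagonal cells / n x 100
```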

The gross difference rate includes differences in both directions, partly or 
substantially offsetting. The net difference rate is the non-offsetting part of the gross difference 
rate. The net difference rate can be written as: 

$$ E = \frac{b - c}{n} \times 100 $$

For items with multiple response categories, the net difference is defined as the number of cases 
above the main diagonal minus the number of cases below the main diagonal. Items which are 
measured in constant, linear units (e.g., number of hours) and are symmetric about the diagonal 
can be treated in much the same manner as items with only two categories. For other types of 
items, the net difference rate is more of a general indicator of offsetting errors than a direct 
measure. 
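
A sketch of the net difference rate under the convention stated above (cases above the main diagonal minus cases below it, divided by the sample size) is given here. The sign convention follows that sentence and the Table 3 layout with the reinterview in rows; the function name and counts are hypothetical.

```python
from typing import Sequence


def net_difference_rate(table: Sequence[Sequence[int]]) -> float:
    """Percent: (cases above the diagonal - cases below the diagonal) / n."""
    size = len(table)
    n = sum(sum(row) for row in table)
    above = sum(table[i][j] for i in range(size) for j in range(i + 1, size))
    below = sum(table[i][j] for i in range(size) for j in range(i))
    return 100 * (above - below) / n


# Hypothetical counts, not NHES:91 results.
two_by_two = [[40, 6], [4, 50]]                    # [[a, b], [c, d]]
three_way = [[30, 2, 1], [3, 25, 4], [0, 5, 30]]

net_binary = net_difference_rate(two_by_two)   # (b - c) / n x 100
net_multi = net_difference_rate(three_way)
```

In the two-category example the gross difference is 10 cases but the net difference is only 2, because 4 of the 6 differences in one direction are offset by differences in the other.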

The index of inconsistency is equal to 

$$ I = \frac{G}{2P(1-P)} = \frac{b + c}{2nP(1-P)} \times 100 $$

where $P = \frac{a + c}{n}$. As noted by Biemer and Stokes (1991), G/2 can be viewed as a measure of the 
random response variance under certain conditions, and P(1-P) as the total random variance, 
including both random response and sampling error. Under these conditions, I is the proportion of 
total variability contributed by random response error. For categorical data, the index of 
inconsistency measures the impact of misclassification errors on the total variance of an 
observation, and it is not a direct measure of misclassification error. 
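
The index for a two-category item can be sketched as follows. This is an illustration under a stated assumption: P is taken from the original-interview marginal, (a + c)/n, which mirrors the way the marginal is defined for the L-fold index in this paper. The function name and counts are hypothetical.

```python
def index_of_inconsistency(a: int, b: int, c: int, d: int) -> float:
    """Index of inconsistency (0 to 100 scale) for a 2x2 interview-reinterview table."""
    n = a + b + c + d
    p = (a + c) / n          # assumed: original-interview proportion with the characteristic
    if p in (0.0, 1.0):
        raise ValueError("index is undefined when P is 0 or 1")
    g = (b + c) / n          # gross difference expressed as a proportion
    return 100 * g / (2 * p * (1 - p))


# Hypothetical counts, not NHES:91 results.
index = index_of_inconsistency(a=40, b=6, c=4, d=50)
print(f"index of inconsistency: {index:.1f}")
```

Because the denominator shrinks as P moves toward 0 or 1, the same gross difference rate yields a much larger index for a rare characteristic than for one held by about half the sample.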

The L-fold index of inconsistency is used for items with multiple (L) response 
outcomes. This statistic is basically an average of the ordinary indices of inconsistency computed 
for each two-way layout of the data. It is equal to 

$$ I_L = 100 \times \left[ 1 - \frac{\sum_k P_{kk} - \sum_k P_k^2}{\sum_k P_k (1 - P_k)} \right] $$

where $P_{kk}$ is the proportion of the total sample in the original interview cell k and the reinterview cell k 
(a main diagonal cell), and $P_k$ is the proportion of the sample in the original interview marginal total of 
column k. 
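
A sketch of the L-fold index for a square table of counts is given below, with rows as the reinterview and columns as the original interview so that the column totals supply the original-interview marginals. The function name and counts are hypothetical.

```python
from typing import Sequence


def l_fold_index(table: Sequence[Sequence[int]]) -> float:
    """L-fold index of inconsistency (0 to 100 scale) for an L x L table of counts."""
    size = len(table)
    n = sum(sum(row) for row in table)
    # Proportion of the sample in each main diagonal cell.
    p_kk = [table[k][k] / n for k in range(size)]
    # Original-interview marginal proportions (column totals).
    p_k = [sum(table[i][k] for i in range(size)) / n for k in range(size)]
    denominator = sum(p * (1 - p) for p in p_k)
    numerator = sum(p_kk) - sum(p * p for p in p_k)
    return 100 * (1 - numerator / denominator)


# Hypothetical counts, not NHES:91 results.
three_way_index = l_fold_index([[30, 2, 1], [3, 25, 4], [0, 5, 30]])
two_way_index = l_fold_index([[40, 6], [4, 50]])
```

With only two categories the expression reduces to the ordinary index computed from the original-interview marginal, which gives a convenient consistency check.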

4. Findings 

The sample size, the gross difference rate, the net difference rate and the index of 
inconsistency for the items collected in the reinterview are shown in Table 4. The sample size 
varies from item to item because of skip patterns in the interviews. The table presents the items 
that are common to both the preprimary (children not yet in first grade) and the primary (children in 
first grade or beyond) interviews, followed by items found only in the preprimary interviews, and 
finally those only in the primary interview. 

Overall Assessment 

Before going into the details of the statistics presented, some comments on the 
overall nature of the response variability are in order. The net difference rate is probably the most 
direct measure of bias of the estimates among the three reinterview statistics presented. For over 
80 percent of the items given in Table 4, the net difference rate is less than 5 percent. Only 4 items 
had net difference rates greater than 10 percent, and these four items were restricted to subgroups 
of the set of children with small sample sizes (between 30 and 60 cases). 

The gross difference rate, which includes offsetting as well as non-offsetting errors, follows much 
the same pattern. About three-fourths of the items have gross difference rates that are less than 10 
percent. Of all the items included, several have gross difference rates in excess of 15 percent and 
many of these were for items for subgroups of the population. 

The index of inconsistency is not as easily generalized, since the size of this statistic 
is related to the size of the estimate (the denominator of the index is a function of the percent of 
persons with the characteristic). For items which are present in between 20 and 80 percent of all 
persons, the following general rule used by the Census Bureau is reasonable: an item with an 
index of inconsistency less than 20 has a low level of response variance; an item with an index 
between 20 and 45 has a moderate response variance; and an item with an index over 45 is 
considered highly inconsistent. Using these guidelines, 54 percent of the items included had low 
response variability, 33 percent had moderate response variability, and 13 percent had high 
response variability. 
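
The rule of thumb described above can be written as a small classifier. The function name is illustrative, and the handling of values exactly equal to 20 or 45 (treated here as moderate) is an assumption, since the text does not say which side the boundaries fall on. The rule is intended for items present in between 20 and 80 percent of persons.

```python
def response_variance_level(index: float) -> str:
    """Classify an index of inconsistency using the thresholds quoted in the text."""
    if index < 20:
        return "low"
    if index <= 45:       # boundary handling at 20 and 45 is an assumption
        return "moderate"
    return "high"


# Index values quoted elsewhere in this paper.
for value in (12.7, 15.8, 31.3):
    print(value, response_variance_level(value))
```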



Items with Large Measurement Errors 

The gross difference rate, the net difference rate, and the index of inconsistency are 
very often related to each other. An item which has a high estimate for one of the statistics is 
usually found to have at least one of the other two statistics which is larger than average. This 
finding helps in accomplishing the goal of identifying particular survey items that are not very 
reliable. Some of these items which exhibit relatively high measurement errors are discussed 
below. 

Of all the items asked for both preprimary and primary school children, only two 
could be considered to have large measurement errors. The item about how many hours the child 
spent watching television has a relatively large index of inconsistency and difference rates. This may 
be due to several factors, including the general ambiguity of the item, the crude measurement scale 
(whole hours) relative to the internal variability in the item, and differing circumstances (32 percent 
of the differences for this item were attributed to the situation changing). 

The other item in this series which is worth noting is the one about how often the 
child is read to. This item has a large gross difference rate, but moderate index of inconsistency 
and net difference rate. About 27 percent of the differences noted between the original and 
reinterview were attributed to changes in the child's situation. This item had specific pre-coded 
response values which the respondent was asked to use in their response. Nearly all the 
differences reported involved a difference of plus or minus one value of the scale. 

In the preprimary series of questions, the two items that ask whether the daycare 
center or the nursery school/prekindergarten is a Head Start program have large measurement 
errors. While these items were only asked for 31 (for the daycare centers) and 52 (for the nursery 
school/prekindergartens) children, all three of the statistics used indicate that the questions have 
response problems. The cause of the response problems for these items may be the parent's lack 
of knowledge about what constitutes a Head Start program. The child's situation changing is not a 
contributor to the response problems for these items. 
Another related item that had large measurement errors was the one that asked 
parents to classify the program as a nursery school, prekindergarten, or Head Start program. The 
classification of these preschool programs is not simple and the measurement errors reveal that 
parents may not be able to do this very well. 

The only other item in the preprimary series that showed very large measurement 
errors was the one that asked how often the parent talked with the daycare center provider. The 
same item for children who attended a nursery school/prekindergarten had a large gross difference 
rate, but small net difference rate. The daycare center question was only asked for 33 children. 
One of the problems respondents might face with this item is defining what constitutes talking to 
the provider. Some parents might include conversations with the provider when picking the child 
up at the end of the day while some might restrict it to more formal discussions. 

In the primary school children items, no items were observed to have a large gross 
difference rate, net difference rate, and index of inconsistency. Despite this, three items are worth 
noting mainly because they have a large index of inconsistency. One is the age when the child 
started kindergarten, which has a large gross difference rate and index of inconsistency. This item 
asked parents to give the month and year when the child started kindergarten. Most parents 
probably did not have this date memorized and thus were required to mentally construct the 
answer. This construction could have contributed to much of the problem. 

The other two items that had large indexes of inconsistency were the one that asked 
how often the parent talked to the child about school and the one that asked if any of the child's 
previous daycare programs had an educational program. While the results raise some questions 
about the reliability of these items, the relatively low gross and net difference rates do not indicate 
that substantial problems are present. 



Items Requiring Recall 

About 10 items in the primary school interview and a few items in the preprimary 
school interview asked the parent to recall past activities of the child, such as whether the child ever 
attended a daycare center. The items concerning retention in kindergarten and primary school are 
discussed in a later section. 
Except for the question about an educational program in the daycare center, which was 
discussed earlier, the statistics for the recall items are very similar. The gross difference rates run 
from 5.2 to 9.9 percent, the net difference rates range from -3.3 to 4.5 percent, and the indices of 
inconsistency range from 15.8 to 31.3. These relatively low measurement error statistics indicate 
that the recall items worked well. The items were well-defined for the parents and they typically 
repeated the same response in the reinterview as given in the original interview. This finding 
suggests that limited recall of well-defined and salient activities of children for future 
administrations of NHES is reasonable. 



Enrollment and Retention Items 

About 9 items were asked about children's current or past enrollment or retention in 
kindergarten or first grade and above. The items for preschool arrangements and the item which 
asked when the child started kindergarten, which were already mentioned, are excluded from this 
discussion. 

For virtually all of the enrollment and retention items, the three statistics used to 
approximate measurement errors are very small. In general, parents responded consistently to 
these items over both administrations of the interview. The statistics suggest that the items related 
to enrollment and retention are very reliable. 

The initial item which asks if the child is attending or enrolled in school has larger 
measurement errors than any of the other items of this type. Even for this item, the gross 
difference rate is only 4.5 percent, the net difference rate is 0.8 percent, and the index of 
inconsistency is 12.7. This item has the same wording as used in the Current Population Survey. 

Response problems for this item, which is asked for all children regardless of their 
age, appear to be associated almost entirely with preprimary school age children. In particular, 
children who are in nursery school or prekindergarten programs may be sometimes classified as 
enrolled while at other times as not enrolled. Sixteen of the 17 response errors were found in the 
197 preprimary interviews. In the NHES:91, this was not a problem since other questions were 
used to direct the flow of the interviews and classify the child. However, these results do indicate 
that the item may have high response errors when used for young children. 
5. Summary 

The reinterview program for the Early Childhood Education survey in the NHES:91 
was designed to help identify specific items in the interviews that were not reliable, to quantify the 
response variance for groups of items, and to provide feedback for future administrations of the 
interviews. The reinterview program accomplished all three of these objectives. 

The results of the reinterview are encouraging. Most of the items included in the 
reinterview had small to moderate measurement errors. For the specific items with high 
measurement errors, potential problems included vague or 
ambiguous classifications and parents' lack of knowledge about the item. 

One of the weaknesses of the reinterview program was its limited scope. Only 
slightly over 500 reinterviews were conducted and this limits the ability to look more closely at the 
distribution of errors by characteristics of the respondents. For example, the type of analysis done 
by O'Muircheartaigh (1986) on the correlates of response errors cannot be carried out with a 
sample of this size. In future administrations of the NHES, reinterviews will still be conducted 
using the same basic methods, but the size of the program may be increased if these types of 
analyses are viewed as important. 